We previously developed a polymorphic type system and a type checker for a multithreaded lock-based
polymorphic typed assembly language (MIL) that ensures that well-typed programs do not
encounter race conditions. This paper extends such work by taking into consideration deadlocks. The 
extended type system verifies that locks are acquired in the proper order. Towards this end we require 
a language with annotations that specify the locking order. Rather than asking the programmer (or the 
compiler's backend) to specifically annotate each newly introduced lock, we present an algorithm to 
infer the annotations. The result is a type checker whose input language is non-decorated as before, 
but that further checks that programs are exempt from deadlocks. 

1 Introduction 

Type systems for the static detection of races and deadlocks in lock-based programs try to contradict
the idea put forward by some authors that "the association between locks and data is established mostly
by convention" [13]. Despite all the pathologies usually associated with locks (in the aforementioned
article and others), especially at the system level, locks are here to stay [5].

Deadlock detection should be addressed at the appropriate level of abstraction, for, in general, com- 
piled code that does not deadlock allows us to conclude nothing of the source code. Nevertheless, the 
problem remains valid at the assembly level and fits quite nicely in the philosophy of typed assembly 
languages [14]. By capturing a wider set of semantic properties, including the absence of deadlocks, we 
improve compiler certification in systems where code must be checked for safety before execution, in 
particular those with untrusted or malicious components. 

Our language targets a shared-memory machine featuring an array of processors and a thread pool
common to all processors [10, 17]. The thread pool holds threads for which no processor is available;
a scheduler chooses a thread from this pool should a processor become idle. Threads voluntarily
release processors: our model fits in the cooperative multi-threading category. For increased flexibility
(and unlike many other models, including [12]) we allow forking threads that hold locks, hence we
allow the suspension of processes while in critical regions. A prototype implementation can be found at
http://gloss.di.fc.ul.pt/mil

The code in Figure 1 presents a typical example of a potential deadlock comprising a cycle of threads
where each thread requests a lock held by the next thread. Imagine the code running on a two-processor
machine: after main completes its execution, each philosopher embarks on a busy-waiting loop, only
that two of them will be running in processors, while the third is (and will indefinitely remain) in the
run-pool. Situations of deadlocks comprising suspended code are known to be difficult to deal with [13].
Our notion of deadlocked state takes into account running and suspended threads.

Another source of difficulties in characterizing deadlock states derives from the low-level nature of
our language that decouples the action of lock acquisition from that of entering a critical section, and
that features non-blocking instructions only. As such the meaning of "entering a critical section" cannot
be of a syntactic nature.

main () {
  f1,r3 := newLock; f3,r5 := newLock; f2,r4 := newLock   -- 3 forks
  r1 := r3; r2 := r4; fork liftLeftFork[f1,f2]           -- 1st philosopher
  r1 := r4; r2 := r5; fork liftLeftFork[f2,f3]           -- 2nd philosopher
  r1 := r5; r2 := r3; fork liftLeftFork[f3,f1]           -- 3rd philosopher
  done
}

liftLeftFork ∀[l,m].(r1:⟨l⟩^l, r2:⟨m⟩^m) {
  r3 := testSetLock r1
  if r3 = 0 jump liftRightFork[l,m]
  jump liftLeftFork[l,m]
}

liftRightFork ∀[l,m].(r1:⟨l⟩^l, r2:⟨m⟩^m) requires {l} {
  r3 := testSetLock r2
  if r3 = 0 jump eat[l,m]
  jump liftRightFork[l,m]
}

eat ∀[l,m].(r1:⟨l⟩^l, r2:⟨m⟩^m) requires {l,m} {
  -- eat
  unlock r1                                              -- lay down the left fork
  unlock r2                                              -- lay down the right fork
  -- think
  jump liftLeftFork[l,m]
}

Figure 1: The dining philosophers written in MIL

A characteristic of our machine is the syntactic dissociation of the test-and-set-lock and the jump- 
to-critical operations, for which we provide two distinct instructions, as found in conventional instruc- 
tion sets. Furthermore, there is no syntactic distinction between a conventional conditional jump and a 
(conditional) jump-to-critical instruction, and the test-set-lock and jump-to-critical instructions can be 
separated by arbitrary assembly code. As far as the type system goes, the thread holds the lock only after 
the conditional jump, even though at runtime it may have been obtained long before. 

The main contributions of this paper are:

• A type system for deadlock elimination. We devise a type system that establishes a strict partial
order on lock acquisition, hence enforcing that well-typed MIL programs do not deadlock
(Theorem 4).

• An algorithm for automatic program annotation. In order to check the absence of deadlock, MIL
programs must be annotated to reflect the order by which locks must be acquired. Annotating large
assembly programs, either manually or as the result of a compilation process, is not plausible.
We present an algorithm that takes a plain MIL program and produces an annotated program
together with a collection of constraints over lock sets that are passed to a constraint solver. In
case the constraints are solvable the annotated program is typable (Theorem 8), hence free from
deadlocks.

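The outline of this paper is as follows. The next section introduces the syntax of programs and
machine states, together with the running example. Then Section 3 presents the operational semantics
and the notion of deadlocked states. Section 4 describes the type system and the first main result: typable
states do not deadlock. Section 5 introduces the annotation algorithm and the second main result, the
correctness of the algorithm with respect to the type system. Finally, Section 6 describes related work
and concludes the paper.

\[
\begin{array}{llcl}
\text{registers} & r & ::= & r_1 \mid \dots \mid r_R \\
\text{lock values} & b & ::= & 0 \mid 1 \mid 0^{\lambda} \\
\text{values} & v & ::= & r \mid n \mid b \mid l \mid v[\lambda] \mid\ ?\tau \\
\text{instructions} & \iota & ::= & \\
\quad\text{control flow} & & & r := v \mid r := r + v \mid \mathsf{if}\ r = v\ \mathsf{jump}\ v \mid \mathsf{fork}\ v \\
\quad\text{memory} & & & r := \mathsf{malloc}\,[\vec{\tau}]^{\lambda} \mid r := v[n] \mid r[n] := v \\
\quad\text{locking} & & & \lambda : (\Lambda, \Lambda), r := \mathsf{newLock} \mid r := \mathsf{testSetLock}\ v \mid \mathsf{unlock}\ v \\
\text{inst. sequences} & I & ::= & \iota; I \mid \mathsf{jump}\ v \mid \mathsf{done} \\
\text{types} & \tau & ::= & \mathsf{int} \mid \lambda \mid \langle\vec{\tau}\rangle^{\lambda} \mid \Gamma\ \mathsf{requires}\ \Lambda \mid \forall[\lambda : (\Lambda, \Lambda)].\tau \\
\text{register file types} & \Gamma & ::= & r_1 : \tau_1, \dots, r_n : \tau_n \\
\text{permissions} & \Lambda & ::= & \lambda_1, \dots, \lambda_n \\
\text{heaps} & H & ::= & \{l_1 : h_1, \dots, l_n : h_n\} \\
\text{heap values} & h & ::= & \langle v_1 \dots v_n\rangle^{\lambda} \mid \tau\{I\} \\
\text{thread pool} & T & ::= & \{\langle l_1[\vec{\lambda}_1], R_1\rangle, \dots, \langle l_n[\vec{\lambda}_n], R_n\rangle\} \\
\text{register files} & R & ::= & \{r_1 : v_1, \dots, r_R : v_R\} \\
\text{processors array} & P & ::= & \{1 : p_1, \dots, N : p_N\} \\
\text{processor} & p & ::= & \langle R; \Lambda; I\rangle \\
\text{states} & S & ::= & \langle H; T; P\rangle \mid \mathsf{halt}
\end{array}
\]

Figure 2: Syntax.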



2 Syntax 

The syntax of our language is generated by the grammar in Figure 2. We rely on two mutually disjoint
sets for heap labels, ranged over by $l$, and for singleton lock types, ranged over by $\lambda$. Letter $n$
ranges over integer values.

Values $v$ comprise registers $r$, integer values $n$, lock values $b$, labels $l$, type application
$v[\lambda]$, and uninitialised values $?\tau$. Lock value $0$ represents an open lock, whereas lock value $1$
denotes a closed lock; the $\lambda$ annotation in $0^{\lambda}$ makes it possible to determine the lock guarding
the critical section a processor is trying to enter, and will be useful when defining deadlocked states.
Lock values are runtime entities; they need to be distinct from conventional integer values for typing
purposes only. Labels are used as heap addresses. Uninitialised values represent meaningless data of a
certain type.

Most of the machine instructions $\iota$ presented in Figure 2 are standard in assembly languages. Distinct
in MIL are the instructions for creating new threads (fork places in the run queue a new thread waiting
for execution), for allocating memory ($\mathsf{malloc}\,[\tau_1, \dots, \tau_n]^{\lambda}$ allocates a tuple in the
heap protected by lock $\lambda$ and comprising $n$ cells, each of which contains an uninitialised value of
type $\tau_i$), and for manipulating locks. In this last group one finds newLock to create a lock in the heap
and store its address in register $r$ ($\lambda$ describes the singleton lock type associated to the new lock,
further described below), testSetLock to acquire a lock, and unlock to release a lock.

Instructions are organised in sequences $I$, ending in jump or in done. Instruction done terminates a
thread, voluntarily releasing the core, giving rise to a cooperative multi-threading model of computation.

Types $\tau$ include the integer type int, the singleton lock type $\lambda$, the tuple type
$\langle\vec{\tau}\rangle^{\lambda}$ describing a tuple in the heap protected by lock $\lambda$, and the code type
$\forall[\vec{\lambda} : (\Lambda, \Lambda)].(\Gamma\ \mathsf{requires}\ \Lambda)$ representing a code block
abstracted on singleton lock types $\vec{\lambda}$, expecting registers of the types in $\Gamma$ and requiring
locks as in $\Lambda$. Each universal variable is bound by two sets of singleton lock types $\Lambda$, used for
deadlock prevention, as described below. For simplicity we allow polymorphism over singleton lock
types only; for abstraction over arbitrary types see [10].

The abstract machine is parametric on the number of available processors $N$, and on the number
of registers per processor $R$. An abstract machine can be in two possible states $S$: halted or
running. A running machine comprises a heap $H$, a thread pool $T$, and an array of processors $P$ of
fixed length $N$. Heaps are maps from labels $l$ into heap values $h$ that may be either data tuples or code
blocks. Tuples $\langle v_1, \dots, v_n\rangle^{\lambda}$ are vectors of mutable values $v_i$ protected by some lock
$\lambda$. Code blocks $\forall[\vec{\lambda} : (\Lambda, \Lambda)].(\Gamma\ \mathsf{requires}\ \Lambda)\{I\}$ comprise a
signature (a code type) and an instruction sequence $I$, to be executed by a processor. A thread pool $T$
is a multiset of pairs $\langle l[\vec{\lambda}], R\rangle$, each of which contains the address (a label) of a code
block in the heap, a sequence of singleton lock types to act as arguments to the forall type of the code
block, and a register file. A processor array $P$ contains $N$ processors, each of which is composed of a
register file $R$ mapping the processor's registers to values, a set of locks $\Lambda$ (the locks held by the
thread running at the processor, often called the thread's permission), and a sequence of instructions $I$
(the instructions that remain to execute).

Lock order annotations Deadlocks are usually prevented by imposing a strict partial order on locks,
and by respecting this order when acquiring locks [4, 9, 12]. The syntax in Figure 2 introduces
annotations that specify the locking order. When creating a new lock, we declare the order between the
newly introduced singleton lock type and the locks known to the program. We use the notation
$\lambda : (\Lambda_1, \Lambda_2)$ to mean that lock type $\lambda$ is greater than all lock types in set $\Lambda_1$ and
smaller than each lock type in set $\Lambda_2$. The annotated syntax differs from the original syntax
([10, 11]) in two places: at lock creation $\lambda : (\Lambda, \Lambda), r := \mathsf{newLock}$; and in universal
types $\forall[\lambda : (\Lambda, \Lambda)].\tau$, where we explicitly specify the lock order on newly introduced
singleton lock types.
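To make the role of the annotations concrete, the following is a minimal Python sketch (not part of MIL or of its type checker) that reads a set of annotations $\lambda : (\Lambda_1, \Lambda_2)$ and derives one acquisition order compatible with them, or reports that no strict order exists. The function name `lock_order` and the dictionary encoding are assumptions made for illustration only.

```python
from graphlib import CycleError, TopologicalSorter


def lock_order(annotations):
    """Return one total order compatible with the annotations, or None.

    annotations maps a lock name to a pair (smaller, larger) of sets,
    mirroring the notation lambda : (Lambda1, Lambda2). This encoding is
    hypothetical; it is not the concrete MIL syntax.
    """
    # preds[x] collects every lock declared to be smaller than x.
    preds = {lock: set() for lock in annotations}
    for lock, (smaller, larger) in annotations.items():
        for s in smaller:
            preds.setdefault(s, set())
            preds[lock].add(s)
        for g in larger:
            preds.setdefault(g, set()).add(lock)
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        # The declared relation is cyclic, hence not a strict partial order.
        return None


# Annotations for three locks: f1 is unconstrained, f3 is above f1,
# and f2 sits between f1 and f3.
annotations = {
    "f1": (set(), set()),
    "f3": ({"f1"}, set()),
    "f2": ({"f1"}, {"f3"}),
}
order = lock_order(annotations)

# A cyclic declaration: a above b and b above a.
cyclic = lock_order({"a": ({"b"}, set()), "b": ({"a"}, set())})
```

Under these annotations the only compatible order is f1, f2, f3, whereas the cyclic declaration yields no order at all.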

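Example Figure 1 shows an example of a non-annotated program. Annotating such a program requires
describing the order for each lock introduced in code block main, say,

f1::({},{}), r3 := newLock
f3::({f1},{}), r5 := newLock
f2::({f1},{f3}), r4 := newLock

and at the types for the three code blocks below.

liftLeftFork ∀[l::({},{})].∀[m::({l},{})].(r1:⟨l⟩^l, r2:⟨m⟩^m)
liftRightFork ∀[l::({},{})].∀[m::({l},{})].(r1:⟨l⟩^l, r2:⟨m⟩^m) requires {l}
eat ∀[l::({},{})].∀[m::({l},{})].(r1:⟨l⟩^l, r2:⟨m⟩^m) requires {l,m}

Notice that abstracting one lock at a time, as in the types just shown, precludes declaring code blocks
with non-strict partial orders on locks, such as $\forall[l : (\emptyset, \{m\}), m : (\{l\}, \emptyset)].\tau$, which
cannot be fulfilled by any conceivable sequence of instructions.

\[
\frac{\forall i.\ P(i) = \langle \_; \_; \mathsf{done}\rangle}
     {\langle \_; \emptyset; P\rangle \to \mathsf{halt}}
\quad \text{(R-HALT)}
\]

\[
\frac{P(i) = \langle \_; \_; \mathsf{done}\rangle \qquad H(l) = \forall[\vec{\lambda} : (\_, \_)].(\_\ \mathsf{requires}\ \Lambda)\{I\}}
     {\langle H; T \uplus \{\langle l[\vec{\lambda}'], R\rangle\}; P\rangle \to \langle H; T; P\{i : \langle R; \Lambda; I\rangle[\vec{\lambda}'/\vec{\lambda}]\}\rangle}
\quad \text{(R-SCHEDULE)}
\]

\[
\frac{P(i) = \langle R; \Lambda \uplus \Lambda'; (\mathsf{fork}\ v; I)\rangle \qquad \hat{R}(v) = l[\vec{\lambda}] \qquad H(l) = \forall[\_].(\_\ \mathsf{requires}\ \Lambda')\{\_\}}
     {\langle H; T; P\rangle \to \langle H; T \cup \{\langle l[\vec{\lambda}], R\rangle\}; P\{i : \langle R; \Lambda; I\rangle\}\rangle}
\quad \text{(R-FORK)}
\]

\[
\frac{P(i) = \langle R; \Lambda; (\lambda : (\_, \_), r := \mathsf{newLock}; I)\rangle \qquad l \notin \mathrm{dom}(H) \qquad \lambda'\ \text{fresh}}
     {\langle H; T; P\rangle \to \langle H\{l : \langle 0\rangle^{\lambda'}\}; T; P\{i : \langle R\{r : l\}; \Lambda; I[\lambda'/\lambda]\rangle\}\rangle}
\quad \text{(R-NEWLOCK)}
\]

\[
\frac{P(i) = \langle R; \Lambda; (r := \mathsf{testSetLock}\ v; I)\rangle \qquad \hat{R}(v) = l \qquad H(l) = \langle 0\rangle^{\lambda}}
     {\langle H; T; P\rangle \to \langle H\{l : \langle 1\rangle^{\lambda}\}; T; P\{i : \langle R\{r : 0^{\lambda}\}; \Lambda \uplus \{\lambda\}; I\rangle\}\rangle}
\quad \text{(R-TSL0)}
\]

\[
\frac{P(i) = \langle R; \Lambda; (r := \mathsf{testSetLock}\ v; I)\rangle \qquad H(\hat{R}(v)) = \langle 1\rangle^{\lambda} \qquad \lambda \notin \Lambda}
     {\langle H; T; P\rangle \to \langle H; T; P\{i : \langle R\{r : 1\}; \Lambda; I\rangle\}\rangle}
\quad \text{(R-TSL1)}
\]

\[
\frac{P(i) = \langle R; \Lambda \uplus \{\lambda\}; (\mathsf{unlock}\ v; I)\rangle \qquad \hat{R}(v) = l \qquad H(l) = \langle \_\rangle^{\lambda}}
     {\langle H; T; P\rangle \to \langle H\{l : \langle 0\rangle^{\lambda}\}; T; P\{i : \langle R; \Lambda; I\rangle\}\rangle}
\quad \text{(R-UNLOCK)}
\]

Figure 3: Operational semantics (thread pool and locks).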

3 Operational Semantics and Deadlocked States 

The operational semantics is defined in Figures 3 and 4. The scheduling model of our machine is
described by the first three rules in Figure 3. The machine halts when all processors are idle and the thread
pool is empty (rule R-HALT). An idle processor (a processor that executes instruction done) picks up an
arbitrary thread from the thread pool and activates it (rule R-SCHEDULE); the argument locks
$\vec{\lambda}'$ replace the parameters $\vec{\lambda}$ in the code for the processor. For a fork instruction, the
machine creates a "closure" by putting together the code label plus its arguments, $l[\vec{\lambda}]$, and a
copy of the registers, $R$, and by placing it in the thread pool. The thread permission is partitioned in
two: one part ($\Lambda$) stays with the thread, the other ($\Lambda'$) goes with the newly created thread, as
required by the type of its code.

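Some rules rely on the evaluation function $\hat{R}$ that looks for values in registers and in value
application.

\[
\hat{R}(v) =
\begin{cases}
R(v) & \text{if } v \text{ is a register} \\
\hat{R}(v')[\lambda] & \text{if } v \text{ is } v'[\lambda] \\
v & \text{otherwise}
\end{cases}
\]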



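In our model the heap tuple $\langle 0\rangle^{\lambda}$ represents an open lock, whereas
$\langle 1\rangle^{\lambda}$ represents a closed lock. A lock is a uni-dimensional tuple holding a lock value
because the machine provides for tuple allocation only; lock $\lambda$ is used for type safety purposes, just
like all other singleton lock types. Instruction newLock creates a new open lock in the heap and places a
reference $l$ to it in register $r$. Instruction testSetLock loads the contents of the lock tuple into register
$r$ and sets the heap value to $\langle 1\rangle^{\lambda}$; it also makes sure that the lock is not in the thread's
permission (rules R-TSL0 and R-TSL1). Further, applying the instruction to an unlocked lock adds lock
$\lambda$ to the permission of the processor (rule R-TSL0). Locks are released using instruction unlock, as
long as the thread holds the lock (rule R-UNLOCK).

Rules related to memory manipulation are described in Figure 4. Rule R-MALLOC creates a
heap-allocated, $\lambda$-protected, uninitialised tuple and moves its address to register $r$. To store values
in, and load from, a tuple we require that the lock that guards the tuple is among the processor's
permission. In rules R-BRANCHT and R-BRANCHF, we ignore the lock annotation on lock values, so that
$0^{\lambda}$ is considered equal to $0$. The remaining rules are standard (cf. [14]).

\[
\frac{P(i) = \langle R; \Lambda; (r := \mathsf{malloc}\,[\vec{\tau}]^{\lambda}; I)\rangle \qquad l \notin \mathrm{dom}(H)}
     {\langle H; T; P\rangle \to \langle H\{l : \langle ?\vec{\tau}\rangle^{\lambda}\}; T; P\{i : \langle R\{r : l\}; \Lambda; I\rangle\}\rangle}
\quad \text{(R-MALLOC)}
\]

\[
\frac{P(i) = \langle R; \Lambda; (r := v[n]; I)\rangle \qquad H(\hat{R}(v)) = \langle v_1 .. v_n .. v_{n+m}\rangle^{\lambda} \qquad \lambda \in \Lambda}
     {\langle H; T; P\rangle \to \langle H; T; P\{i : \langle R\{r : v_n\}; \Lambda; I\rangle\}\rangle}
\quad \text{(R-LOAD)}
\]

\[
\frac{P(i) = \langle R; \Lambda; (r[n] := v; I)\rangle \qquad R(r) = l \qquad H(l) = \langle v_1 .. v_n .. v_{n+m}\rangle^{\lambda} \qquad \lambda \in \Lambda}
     {\langle H; T; P\rangle \to \langle H\{l : \langle v_1 .. \hat{R}(v) .. v_{n+m}\rangle^{\lambda}\}; T; P\{i : \langle R; \Lambda; I\rangle\}\rangle}
\quad \text{(R-STORE)}
\]

\[
\frac{P(i) = \langle R; \Lambda; \mathsf{jump}\ v\rangle \qquad \hat{R}(v) = l[\vec{\lambda}] \qquad H(l) = \forall[\vec{\lambda}' : (\_, \_)].\_\{I\}}
     {\langle H; T; P\rangle \to \langle H; T; P\{i : \langle R; \Lambda; I[\vec{\lambda}/\vec{\lambda}']\rangle\}\rangle}
\quad \text{(R-JUMP)}
\]

\[
\frac{P(i) = \langle R; \Lambda; (r := v; I)\rangle}
     {\langle H; T; P\rangle \to \langle H; T; P\{i : \langle R\{r : \hat{R}(v)\}; \Lambda; I\rangle\}\rangle}
\quad \text{(R-MOVE)}
\]

\[
\frac{P(i) = \langle R; \Lambda; (r := r' + v; I)\rangle}
     {\langle H; T; P\rangle \to \langle H; T; P\{i : \langle R\{r : R(r') + \hat{R}(v)\}; \Lambda; I\rangle\}\rangle}
\quad \text{(R-ARITH)}
\]

\[
\frac{P(i) = \langle R; \Lambda; (\mathsf{if}\ r = v\ \mathsf{jump}\ v'; \_)\rangle \qquad R(r) = v \qquad \hat{R}(v') = l[\vec{\lambda}] \qquad H(l) = \forall[\vec{\lambda}' : (\_, \_)].\_\{I\}}
     {\langle H; T; P\rangle \to \langle H; T; P\{i : \langle R; \Lambda; I[\vec{\lambda}/\vec{\lambda}']\rangle\}\rangle}
\quad \text{(R-BRANCHT)}
\]

\[
\frac{P(i) = \langle R; \Lambda; (\mathsf{if}\ r = v\ \mathsf{jump}\ \_; I)\rangle \qquad R(r) \neq v}
     {\langle H; T; P\rangle \to \langle H; T; P\{i : \langle R; \Lambda; I\rangle\}\rangle}
\quad \text{(R-BRANCHF)}
\]

Figure 4: Operational semantics (memory and control flow).

\[
\frac{\Psi \vdash \lambda : (\Lambda_1, \_) \qquad \lambda_1 \in \Lambda_1}{\Psi \vdash \lambda_1 \prec \lambda}
\qquad
\frac{\Psi \vdash \lambda : (\_, \Lambda_2) \qquad \lambda_2 \in \Lambda_2}{\Psi \vdash \lambda \prec \lambda_2}
\qquad
\frac{\Psi \vdash \lambda_1 \prec \lambda_2 \qquad \Psi \vdash \lambda_2 \prec \lambda_3}{\Psi \vdash \lambda_1 \prec \lambda_3}
\]

\[
\frac{\Psi \vdash \lambda_1 \prec \lambda \quad \cdots \quad \Psi \vdash \lambda_n \prec \lambda}{\Psi \vdash \{\lambda_1, \dots, \lambda_n\} \prec \lambda}
\qquad
\frac{\Psi \vdash \lambda \prec \lambda_1 \quad \cdots \quad \Psi \vdash \lambda \prec \lambda_n}{\Psi \vdash \lambda \prec \{\lambda_1, \dots, \lambda_n\}}
\]

Figure 5: Less-than relation on locks and permissions

\[
\frac{\mathrm{ftv}(\tau) \subseteq \mathrm{dom}(\Psi)}{\Psi \vdash \tau}
\qquad
\Psi \vdash r_1 : \tau_1, \dots, r_{n+m} : \tau_{n+m} <: r_1 : \tau_1, \dots, r_n : \tau_n
\quad \text{(T-TYPE, S-REGFILE)}
\]

\[
\Psi, l : \tau; \Gamma \vdash l : \tau
\qquad
\Psi; \Gamma_1, r_i : \tau_i, \Gamma_2 \vdash r_i : \tau_i
\qquad
\Psi; \Gamma \vdash n : \mathsf{int}
\qquad
\Psi; \Gamma \vdash 0, 1, 0^{\lambda} : \lambda
\qquad
\Psi; \Gamma \vdash\ ?\tau : \tau
\]
\[
\text{(T-LABEL, T-REG, T-INT, T-LOCK, T-UNINIT)}
\]

\[
\frac{\Psi \vdash \lambda' \qquad \Psi; \Gamma \vdash v : \forall[\lambda : (\Lambda_1, \Lambda_2)].\tau \qquad \Psi \vdash \Lambda_1 \prec \lambda' \prec \Lambda_2}
     {\Psi; \Gamma \vdash v[\lambda'] : \tau[\lambda'/\lambda]}
\quad \text{(T-VALAPP)}
\]

Figure 6: Rules for values $\Psi; \Gamma \vdash v : \tau$, for subtyping $\Psi \vdash \Gamma <: \Gamma$, and for types $\Psi \vdash \tau$.

Deadlocked States The difficulty in characterising deadlock states stems from the fact that processors
never block and that threads may become (voluntarily) suspended while in a critical region. We aim at
capturing conventional techniques for acquiring locks, namely busy-waiting and sleep-lock [17].
Towards this end, we need to restrict reduction of a given state $S$ to that of a single processor in order to
control the progress of a single core: let relation $S \to_i S'$ denote a reduction step on processor $i$
excluding rules R-HALT, R-SCHEDULE and R-UNLOCK.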

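Definition 1 (Deadlocked states). Let $S$ be the state $\langle H; T; P\rangle$.

• A processor $\langle R; \Lambda; I\rangle$ holds lock $\lambda$ when $\lambda \in \Lambda$; a suspended thread
$\langle l[\vec{\lambda}'], R\rangle$ holds lock $\lambda$ when
$H(l) = \forall[\vec{\lambda} : (\_, \_)].(\_\ \mathsf{requires}\ \Lambda)\{\_\}$ and
$\lambda \in \Lambda[\vec{\lambda}'/\vec{\lambda}]$;

• A processor $p$ in $P$ immediately tries to enter a critical section guarded by lock $\lambda$ if $p$ is of
the form $\langle R; \_; (\mathsf{if}\ r = 0\ \mathsf{jump}\ v; \_)\rangle$ and $R(r) = 0^{\lambda}$;

• For busy waiting, a thread in processor $p_i$ is trying to enter a critical region guarded by $\lambda$ if
$S \to_i^{*} S'$ and processor $p_i$ in state $S'$ immediately tries to enter a critical section guarded by
$\lambda$;

• For sleep-lock, a thread $\langle l[\vec{\lambda}'], R\rangle$ in thread pool $T$ is trying to enter a critical
region guarded by $\lambda$ if $H(l) = \forall[\vec{\lambda} : \_].(\_\ \mathsf{requires}\ \Lambda)\{I\}$, and the
thread in processor $p_1$ of the state obtained from $S$ by replacing processor 1 with
$\langle R; \Lambda; I\rangle[\vec{\lambda}'/\vec{\lambda}]$ is trying to enter a critical region guarded by $\lambda$;

• A state $S$ is deadlocked if there exist locks $\lambda_0, \dots, \lambda_n$, with $\lambda_0 = \lambda_n$, and
indices $d_0, \dots, d_{n-1}$ ($n > 0$) such that for each $0 \le i < n$, either processor $p_{d_i}$ or
suspended thread $t_{d_i}$ holds lock $\lambda_i$ and is trying to enter a critical region guarded by
$\lambda_{i+1}$.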

Notice that $d_i \neq d_j$ does not imply $p_{d_i} \neq p_{d_j}$, and similarly for threads in the thread pool,
so that a state deadlocked on locks $\lambda_0, \dots, \lambda_n$ may involve fewer than $n$ threads. We have
excluded the R-UNLOCK rule from the $\to_i$ reduction relation, yet releasing a lock is not necessarily an
indication that the thread is leaving a deadlocked state, for the released lock may not be involved in the
deadlock; a more general definition of deadlocked state would take this fact into account.
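The last clause of Definition 1 amounts to finding a cycle among locks, where an edge goes from a held lock to the lock its holder is trying to obtain. The Python sketch below illustrates only that clause. It assumes that "holds" and "is trying to enter" have already been computed for every processor and suspended thread (in the definition they come from the machine state and from the $\to_i$ relation), and the name `is_deadlocked` and the tuple encoding are hypothetical.

```python
def is_deadlocked(threads):
    """Detect a hold/wait cycle among locks.

    threads is an iterable of (held, wanted) pairs: `held` is the set of
    locks a processor or suspended thread holds, `wanted` is the lock
    guarding the critical region it is trying to enter (or None).
    """
    # Edge a -> b whenever some thread holds a and is trying to obtain b.
    edges = {}
    for held, wanted in threads:
        if wanted is None:
            continue
        for lock in held:
            edges.setdefault(lock, set()).add(wanted)

    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(lock):
        # Depth-first search; reaching a GREY node closes a cycle.
        colour[lock] = GREY
        for nxt in edges.get(lock, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY or (state == WHITE and visit(nxt)):
                return True
        colour[lock] = BLACK
        return False

    return any(colour.get(l, WHITE) == WHITE and visit(l) for l in list(edges))


# Three philosophers, each holding a left fork and waiting for the right one.
all_hold_left = [({"f1"}, "f2"), ({"f2"}, "f3"), ({"f3"}, "f1")]
# The third philosopher has not yet obtained its left fork.
one_empty_handed = [({"f1"}, "f2"), ({"f2"}, "f3"), (set(), "f1")]

stuck = is_deadlocked(all_hold_left)
not_stuck = is_deadlocked(one_empty_handed)
```

Since the search ranges over locks rather than threads, a single thread contributing several edges is handled uniformly, in line with the remark that a deadlock on $n$ locks may involve fewer than $n$ threads.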
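4 A Type System for Deadlock Prevention

Type System Typing environments $\Psi$ map heap addresses $l$ to types $\tau$, and singleton lock types
$\lambda$ to lock kinds $(\Lambda_1, \Lambda_2)$. An entry $\lambda : (\Lambda_1, \Lambda_2)$ in $\Psi$ means that
$\lambda$ is larger than all lock types in $\Lambda_1$ and smaller than any lock type in $\Lambda_2$, a notion
captured by relation $\prec$ described in Figure 5. Instructions are also checked against a register file
type $\Gamma$ holding the current types of the registers, and a set $\Lambda$ of lock variables: the permission
of (the processor executing) the code block. The type system is presented in Figures 6 to 9.

Typing rules for values are illustrated in Figure 6. Rule T-TYPE makes sure types are well-formed,
that is, that all free singleton lock types (or free type variables, ftv) in a type are bound in the typing
environment. A formula $\Gamma <: \Gamma'$ allows "forgetting" registers in the register file type, and is
particularly useful in jump instructions where we want the type of the target code block to be more
general (ask for fewer registers) than those active in the current code [14]. The rule for value application,
T-VALAPP, checks that the argument $\lambda'$ is within the interval $(\Lambda_1, \Lambda_2)$, as required by the
parameter $\lambda$.

\[
\Psi; \Gamma; \emptyset \vdash \mathsf{done}
\quad \text{(T-DONE)}
\]

\[
\frac{\Psi; \Gamma \vdash v : \Gamma'\ \mathsf{requires}\ \Lambda \qquad \Psi; \Gamma; \Lambda' \vdash I \qquad \Psi \vdash \Gamma <: \Gamma'}
     {\Psi; \Gamma; \Lambda \uplus \Lambda' \vdash \mathsf{fork}\ v; I}
\quad \text{(T-FORK)}
\]

\[
\frac{\Psi, \lambda : (\Lambda_1, \Lambda_2); \Gamma\{r : \langle\lambda\rangle^{\lambda}\}; \Lambda \vdash I \qquad \lambda \notin \Psi, \Gamma, \Lambda}
     {\Psi; \Gamma; \Lambda \vdash \lambda : (\Lambda_1, \Lambda_2), r := \mathsf{newLock}; I}
\quad \text{(T-NEWLOCK)}
\]

\[
\frac{\Psi; \Gamma \vdash v : \langle\lambda\rangle^{\lambda} \qquad \Psi; \Gamma\{r : \lambda\}; \Lambda \vdash I \qquad \lambda \notin \Lambda}
     {\Psi; \Gamma; \Lambda \vdash r := \mathsf{testSetLock}\ v; I}
\quad \text{(T-TSL)}
\]

\[
\frac{\Psi; \Gamma \vdash v : \langle\lambda\rangle^{\lambda} \qquad \Psi; \Gamma; \Lambda \vdash I}
     {\Psi; \Gamma; \Lambda \uplus \{\lambda\} \vdash \mathsf{unlock}\ v; I}
\quad \text{(T-UNLOCK)}
\]

\[
\frac{\Psi; \Gamma \vdash r : \lambda \qquad \Psi; \Gamma \vdash v : \Gamma'\ \mathsf{requires}\ (\Lambda \uplus \{\lambda\}) \qquad \Psi; \Gamma; \Lambda \vdash I \qquad \Psi \vdash \Gamma <: \Gamma' \qquad \Psi \vdash \Lambda \prec \lambda}
     {\Psi; \Gamma; \Lambda \vdash \mathsf{if}\ r = 0\ \mathsf{jump}\ v; I}
\quad \text{(T-CRITICAL)}
\]

Figure 7: Typing rules for instructions (thread pool and locks) $\Psi; \Gamma; \Lambda \vdash I$.

\[
\frac{\Psi; \Gamma\{r : \langle\vec{\tau}\rangle^{\lambda}\}; \Lambda \vdash I \qquad \vec{\tau} \neq \lambda' \qquad \lambda \in \Lambda}
     {\Psi; \Gamma; \Lambda \vdash r := \mathsf{malloc}\,[\vec{\tau}]^{\lambda}; I}
\quad \text{(T-MALLOC)}
\]

\[
\frac{\Psi; \Gamma \vdash v : \langle\tau_1 .. \tau_{n+m}\rangle^{\lambda} \qquad \Psi; \Gamma\{r : \tau_n\}; \Lambda \vdash I \qquad \tau_n \neq \lambda' \qquad \lambda \in \Lambda}
     {\Psi; \Gamma; \Lambda \vdash r := v[n]; I}
\quad \text{(T-LOAD)}
\]

\[
\frac{\Psi; \Gamma \vdash v : \tau_n \qquad \Psi; \Gamma \vdash r : \langle\tau_1 .. \tau_{n+m}\rangle^{\lambda} \qquad \Psi; \Gamma\{r : \langle\tau_1 .. \tau_{n+m}\rangle^{\lambda}\}; \Lambda \vdash I \qquad \tau_n \neq \lambda' \qquad \lambda \in \Lambda}
     {\Psi; \Gamma; \Lambda \vdash r[n] := v; I}
\quad \text{(T-STORE)}
\]

\[
\frac{\Psi; \Gamma \vdash v : \tau \qquad \Psi; \Gamma\{r : \tau\}; \Lambda \vdash I}
     {\Psi; \Gamma; \Lambda \vdash r := v; I}
\quad \text{(T-MOVE)}
\]

\[
\frac{\Psi; \Gamma \vdash r' : \mathsf{int} \qquad \Psi; \Gamma \vdash v : \mathsf{int} \qquad \Psi; \Gamma\{r : \mathsf{int}\}; \Lambda \vdash I}
     {\Psi; \Gamma; \Lambda \vdash r := r' + v; I}
\quad \text{(T-ARITH)}
\]

\[
\frac{\Psi; \Gamma \vdash r : \mathsf{int} \qquad \Psi; \Gamma \vdash v : \mathsf{int} \qquad \Psi; \Gamma \vdash v' : \Gamma\ \mathsf{requires}\ \Lambda \qquad \Psi; \Gamma; \Lambda \vdash I}
     {\Psi; \Gamma; \Lambda \vdash \mathsf{if}\ r = v\ \mathsf{jump}\ v'; I}
\quad \text{(T-BRANCH)}
\]

\[
\frac{\Psi; \Gamma \vdash v : \Gamma'\ \mathsf{requires}\ \Lambda \qquad \Psi \vdash \Gamma <: \Gamma'}
     {\Psi; \Gamma; \Lambda \vdash \mathsf{jump}\ v}
\quad \text{(T-JUMP)}
\]

Figure 8: Typing rules for instructions (memory and control flow) $\Psi; \Gamma; \Lambda \vdash I$.

The rules in Figure 7 capture the policy for lock usage. Rule T-DONE requires the release of all
locks before terminating the thread. Rule T-FORK splits permissions into sets $\Lambda$ and $\Lambda'$: the
former is transferred to the forked thread according to the permissions required by the target code block,
the latter remains with the processor. Rule T-NEWLOCK assigns a lock type
$\langle\lambda\rangle^{\lambda}$ to the register. The new singleton lock type $\lambda$ is recorded in $\Psi$, so that
it may be used in the rest of the instructions $I$. Rule T-TSL requires that the value under test is a lock in
the heap (of type $\langle\lambda\rangle^{\lambda}$) and records the type of the lock value $\lambda$ in register $r$.
This rule also disallows testing a lock already held by the processor. Rule T-UNLOCK makes sure that
only held locks are unlocked. Rule T-CRITICAL ensures that the processor holds the permission required
by the target code block, including the lock under test. A processor is guaranteed to hold the tested lock
only after (conditionally) jumping to the critical region. A previous test-and-set-lock instruction may
have obtained the lock, but the type system records that the processor holds the lock only after the
conditional jump. The rule checks that the newly acquired lock is larger than all locks in the possession
of the thread.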
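The lock usage policy can be illustrated on a straight-line trace of lock operations. The Python sketch below is a simplification made under stated assumptions: it ignores registers, forks and jumps to other code blocks, it treats "enter" as the point where T-CRITICAL grants the lock, and it expects the relation $\prec$ to be supplied as an already transitively closed set of pairs. The name `check_lock_usage` and the trace encoding are hypothetical.

```python
def check_lock_usage(ops, less):
    """Check a straight-line trace of lock operations.

    ops  : list of ("enter", lock) or ("unlock", lock) pairs
    less : set of (smaller, larger) pairs, assumed transitively closed
    """
    held = set()  # the permission of the thread
    for op, lock in ops:
        if op == "enter":
            # A held lock may not be acquired again (in the spirit of T-TSL),
            # and the new lock must be larger than every held lock (T-CRITICAL).
            if lock in held or any((h, lock) not in less for h in held):
                return False
            held.add(lock)
        elif op == "unlock":
            # Only held locks may be released (T-UNLOCK).
            if lock not in held:
                return False
            held.remove(lock)
        else:
            raise ValueError(f"unknown operation: {op}")
    # All locks must be released before the thread terminates (T-DONE).
    return not held


less = {("f1", "f2"), ("f2", "f3"), ("f1", "f3")}

in_order = check_lock_usage(
    [("enter", "f1"), ("enter", "f2"), ("unlock", "f1"), ("unlock", "f2")], less)
out_of_order = check_lock_usage(
    [("enter", "f3"), ("enter", "f1"), ("unlock", "f1"), ("unlock", "f3")], less)
leaked = check_lock_usage([("enter", "f1")], less)
```

The first trace respects the order, the second acquires f1 while holding the larger f3, and the third terminates while still holding f1.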

The typing rules for memory and control flow are depicted in Figure 8. Operations for loading
from (T-LOAD), and for storing into (T-STORE), tuples require that the processor holds the right
permissions (the locks for the tuples it reads from, or writes to). Both rules preclude the direct
manipulation of lock values by programs, via the $\tau_n \neq \lambda'$ assumptions.

The rules for typing machine states are illustrated in Figure 9. The rule for a thread item in the thread
pool checks that the type and required registers $R$ are as expected in the type of the code block pointed
by $v$. Similarly, the rule for type checking a processor also permits that type $\Gamma$ of the registers $R$
be more specific than the register file type $\Gamma'$ required to type check the remaining instructions $I$.
The heap value rule for code blocks adds to $\Psi$ each singleton lock type (together with its bounds), so
that they may be used in the rest of the instructions $I$.



Example As expected, the example is not typable with the annotations introduced previously. The
three newLock instructions place in $\Psi$ three entries $f_1 : (\emptyset, \emptyset)$,
$f_2 : (\{f_1\}, \{f_3\})$, $f_3 : (\{f_1\}, \emptyset)$. Then the value (liftLeftFork[f2])[f1] (in the example:
liftLeftFork[f1,f2]) in the first fork instruction issues goals $\Psi \vdash \emptyset \prec f_1 \prec \emptyset$ and
$\Psi \vdash \{f_1\} \prec f_2 \prec \emptyset$, which are easy to guarantee given that $\Psi$ contains an entry
$f_2 : (\{f_1\}, \{f_3\})$. Likewise, the second fork instruction generates goals
$\Psi \vdash \emptyset \prec f_2 \prec \emptyset$ and $\Psi \vdash \{f_2\} \prec f_3 \prec \emptyset$, which again hold
because of the same entry. However, the last fork instruction requires
$\Psi \vdash \emptyset \prec f_3 \prec \emptyset$ and $\Psi \vdash \{f_3\} \prec f_1 \prec \emptyset$, the second of which
does not hold.

Notice however that each of the three jump instructions is typable per se. For example, in code
block liftRightFork, instruction if r3 = 0 jump eat[l,m] requires $\Psi \vdash \{l\} \prec m$, which holds because
the signature for the code block includes the annotation $m : (\{l\}, \emptyset)$.
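The three fork goals of this example can be replayed mechanically. The Python sketch below computes the less-than relation of Figure 5 as a transitive closure over the entries of $\Psi$ and then checks the only non-trivial goal of each fork, namely that the lock instantiating $l$ is smaller than the lock instantiating $m$. The function names and the dictionary encoding of $\Psi$ are assumptions made for illustration.

```python
def less_than(env):
    """Transitive closure of the order declared in env.

    env maps a lock to a pair (smaller, larger) of sets, mirroring an
    entry lambda : (Lambda1, Lambda2) of the typing environment.
    """
    less = set()
    for lock, (smaller, larger) in env.items():
        less |= {(s, lock) for s in smaller}
        less |= {(lock, g) for g in larger}
    changed = True
    while changed:
        # Add (a, d) whenever (a, b) and (b, d) are already present.
        extra = {(a, d) for a, b in less for c, d in less if b == c} - less
        less |= extra
        changed = bool(extra)
    return less


def fork_goal_holds(less, left, right):
    # liftLeftFork binds m with lower bound {l}; instantiating l := left
    # and m := right therefore requires left to be smaller than right.
    return (left, right) in less


env = {
    "f1": (set(), set()),
    "f2": ({"f1"}, {"f3"}),
    "f3": ({"f1"}, set()),
}
less = less_than(env)
results = [
    fork_goal_holds(less, left, right)
    for left, right in [("f1", "f2"), ("f2", "f3"), ("f3", "f1")]
]
```

The first two goals hold and the third fails, matching the discussion above: no entry places f3 below f1.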
Figure 9: Typing rules for machine states.



Typable States Do Not Deadlock The main result of the type system, namely that $\Psi \vdash S$ and
$S \to^{*} S'$ implies $S'$ not deadlocked, follows from Subject Reduction and from Typable States Are
Not Deadlocked, in a conventional manner.

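Lemma 2 (Substitution Lemma). If $\Psi, \lambda : (\Lambda_1, \Lambda_2); \Gamma; \Lambda \vdash I$ and
$\Psi(\lambda') = (\Lambda_1, \Lambda_2)$, then $\Psi\sigma; \Gamma\sigma; \Lambda\sigma \vdash I\sigma$, where
$\sigma = [\lambda'/\lambda]$.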

Theorem 3 (Subject Reduction). If $\Psi \vdash S$ and $S \to S'$, then $\Psi' \vdash S'$, where $\Psi' = \Psi$ or
$\Psi' = \Psi, l : \langle\vec{\tau}\rangle^{\lambda}$ (with $l$ fresh) or
$\Psi' = \Psi, l : \langle\lambda\rangle^{\lambda}, \lambda : (\Lambda_1, \Lambda_2)$ (with $l$, $\lambda$ fresh).

Proof. (Outline) By induction on the derivation of $S \to S'$ proceeding by case analysis on the last
rule of the derivation, using the substitution lemma for rules R-SCHEDULE, R-FORK, R-JUMP, and
R-BRANCHT, as well as weakening in several rules. □



Theorem 4 (Typable States Are Not Deadlocked). If $\Psi \vdash S$, then $S$ is not deadlocked.

Proof. (Sketch) Consider the contrapositive and show that deadlocked states are not typable. Without
loss of generality suppose that $S$ is of the form
$\langle H; \langle t_{d_0}, \dots, t_{d_m}\rangle; \{d_{m+1} : p_{d_{m+1}}, \dots, d_n : p_{d_n}\}\rangle$ with
suspended threads $t_{d_i}$ and processors $p_{d_j}$ not necessarily distinct.

Each of these threads and processors is trying to enter a critical region. For a processor $p_{d_i}$ we have
that $S \to_{d_i}^{*} S'$ where the $d_i$-th processor in $S'$ is of the form
$\langle R; \Lambda; (\mathsf{if}\ r = 0\ \mathsf{jump}\ v; \_)\rangle$ and $R(r) = 0^{\lambda_{d_{i+1}}}$.
By Subject Reduction $\Psi' \vdash S'$ where $\Psi'$ extends $\Psi$ as stated in Theorem 3. A simple derivation
starting from rule T-CRITICAL allows us to conclude that $\Psi \vdash \lambda_{d_i} \prec \lambda_{d_{i+1}}$. For a
thread $t_{d_i} = \langle l[\vec{\lambda}'], R\rangle$ in the thread pool of $S$ we run the machine $S''$ obtained
from $S$ by replacing processor 1 with $\langle R; \Lambda; I\rangle[\vec{\lambda}'/\vec{\lambda}]$, where
$H(l) = \forall[\vec{\lambda} : \_].(\_\ \mathsf{requires}\ \Lambda)\{I\}$. Proceeding as for processors above, we
conclude again that $\Psi \vdash \lambda_{d_i} \prec \lambda_{d_{i+1}}$.

We thus have $\Psi \vdash \lambda_{d_0} \prec \lambda_{d_1}, \dots, \Psi \vdash \lambda_{d_{n-1}} \prec \lambda_{d_n},
\Psi \vdash \lambda_{d_n} \prec \lambda_{d_0}$, which is not satisfiable. □
5 Type Inference

Annotating lock ordering on large assembly programs may not be an easy task. In our setting,
programmers (compilers, more often) produce annotation-free programs such as the one in Figure 1 and
use an inference algorithm to provide for the missing annotations.

The Algorithm The annotation-free syntax is obtained from that in Figure 2 by removing the
$: (\Lambda, \Lambda)$ part both in the newLock instruction and in the universal type. Given an annotation-free
program $H$, algorithm $\mathcal{W}$ produces a pair, comprising a typing environment $\Psi$ and an
annotated program $H^{*}$, such that $\Psi \vdash H^{*}$, or else fails. In the former case $H^{*}$ is typable,
hence does not deadlock (Theorem 4); in the latter case, there is no possible labeling for $H$.

We depend on a set of variables over permissions (sets of locks), ranged over by $\nu$, disjoint from the
set of heap labels and from the set of singleton lock types introduced in Section 2. Constraints are
computed by an intermediate step in our algorithm.

Definition 5 (Constraints and solutions).

• We consider constraints of three distinct forms: $\lambda \prec \lambda$, $\nu \prec \lambda$, and
$\lambda \prec \nu$, and denote by $C$ a set of constraints;

• A substitution $\theta$ is a map from permission variables $\nu$ to permissions $\Lambda$;

• A substitution $\theta$ solves $(\Psi, C)$ if $\Psi\theta \vdash x\theta \prec y\theta$ for all $x \prec y \in C$.

Algorithm $\mathcal{W}$ runs in two phases: the first produces a triple comprising a typing environment
$\Psi$, an annotated program $H^{*}$, and a collection of constraints $C$, all containing variables over
permissions $\Lambda$. The set of constraints is then passed to a constraint solver, which either produces a
substitution $\theta$ or fails. In the former case, the output of $\mathcal{W}$ is the pair
$(\Psi\theta, H^{*}\theta)$; in the latter $\mathcal{W}$ fails. In practice, we do not need to generate $H^{*}$ or
to perform the substitutions; our compiler accepts $H$ if the produced collection of constraints is
solvable, and rejects it otherwise.
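Definition 5 can be read as a check that a candidate substitution makes every constraint derivable. The Python sketch below performs only that check; it is not the constraint solver used by the MIL compiler, whose strategy is not described here. It assumes that every lock in the environment is annotated with a pair of permission variables, and the names `solves`, `env` and `theta` are hypothetical.

```python
def solves(theta, env, constraints):
    """Check whether substitution theta solves (env, constraints).

    theta       : permission variable -> set of locks
    env         : lock -> (lower variable, upper variable)
    constraints : list of (x, y) pairs read as x < y, where x and y are
                  lock names or permission variables
    """
    # Apply theta to the environment and collect the declared order.
    less = set()
    for lock, (lo_var, hi_var) in env.items():
        less |= {(s, lock) for s in theta[lo_var]}
        less |= {(lock, g) for g in theta[hi_var]}
    # Close the relation under transitivity, as in the less-than rules.
    while True:
        extra = {(a, d) for a, b in less for c, d in less if b == c} - less
        if not extra:
            break
        less |= extra

    def ground(x):
        # A variable denotes a set of locks; a lock denotes itself.
        return theta[x] if x in theta else {x}

    return all(
        (a, b) in less
        for x, y in constraints
        for a in ground(x)
        for b in ground(y)
    )


# Two locks with fresh permission variables, and one ordering constraint.
env = {"l2": ("p5", "p6"), "m2": ("p7", "p8")}
constraints = [("l2", "m2")]

empty = {v: set() for v in ("p5", "p6", "p7", "p8")}
ordered = dict(empty, p7={"l2"})  # declare l2 below m2

no_solution = solves(empty, env, constraints)
a_solution = solves(ordered, env, constraints)
```

With all variables mapped to the empty set nothing relates l2 and m2, so the constraint fails; placing l2 in the lower bound of m2 makes it derivable.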

Generating constraints Algorithm $\mathcal{A}$, described in Figure 10, visits the program twice. On a
first step it builds an initial type environment $\Psi_0 = \{l_i : \tau_i^{*}\}_{i \in I}$ collecting the types for all
code blocks in the given program $\{l_i : \tau_i\{I_i\}\}_{i \in I}$, annotating with permission variables
(denoted by $\nu$ and $\rho$) the intervals for the locks bound in forall types; on a second visit it generates
the constraints and the annotated syntax for the instructions in each code block.

The algorithm for instructions, $\mathcal{I}$, also shown in Figure 10, generates annotations for the
singleton lock type introduced in newLock instructions, or further constraints in the case of the
jump-to-critical instruction. In the case of a fork instruction, the algorithm calls function $\mathcal{V}$ to
obtain the required permission $\Lambda'$ and passes the difference $\Lambda \setminus \Lambda'$ to the function
that annotates the continuation $I$.

The algorithm for values, $\mathcal{V}$, generates constraints in the case of type application. Finally, the
algorithm for types annotates the singleton lock types in forall types. In the definition of all algorithms,
permission variables $\nu, \rho, \nu_i, \rho_i$ are freshly introduced.

Example For the running example, we first rename all bound variables so that the type of code block
liftLeftFork mentions l1 and m1, that of liftRightFork mentions l2 and m2, and that of eat uses l3 and m3.
For example:

liftRightFork ∀[l2].∀[m2].(r1:⟨l2⟩^l2, r2:⟨m2⟩^m2) requires {l2}
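\[
\begin{array}{l}
\mathcal{A}(\{l_i : \tau_i\{I_i\}\}_{i \in I}) = (\bigcup_{i \in I} \Psi_i,\ \{l_i : \tau_i^{*}\{I_i^{*}\}\}_{i \in I},\ \bigcup_{i \in I} C_i) \\
\quad \text{where } \tau_i^{*} = \forall[\vec{\lambda}_i : (\vec{\nu}_i, \vec{\rho}_i)].(\Gamma_i\ \mathsf{requires}\ \Lambda_i) = \mathcal{T}(\tau_i) \\
\quad \text{and } (I_i^{*}, \Psi_i, C_i) = \mathcal{I}(I_i,\ \{l_i : \tau_i^{*}\}_{i \in I} \cup \{\vec{\lambda}_i : (\vec{\nu}_i, \vec{\rho}_i)\},\ \Gamma_i,\ \Lambda_i) \\[1ex]
\mathcal{I}((\lambda, r := \mathsf{newLock}; I), \Psi, \Gamma, \Lambda) = ((\lambda : (\nu, \rho), r := \mathsf{newLock}; I^{*}), \Psi', C) \\
\quad \text{where } (I^{*}, \Psi', C) = \mathcal{I}(I,\ \Psi \cup \{\lambda : (\nu, \rho)\},\ \Gamma\{r : \langle\lambda\rangle^{\lambda}\},\ \Lambda) \\[1ex]
\mathcal{I}((\mathsf{if}\ r = 0\ \mathsf{jump}\ v; I), \Psi, \Gamma, \Lambda) = ((\mathsf{if}\ r = 0\ \mathsf{jump}\ v; I^{*}), \Psi', C_1 \cup C_2 \cup \{\Lambda \prec \lambda\}) \\
\quad \text{where } (\Gamma'\ \mathsf{requires}\ (\Lambda \uplus \{\lambda\}), C_1) = \mathcal{V}(v, \Psi, \Gamma) \\
\quad \text{and } (I^{*}, \Psi', C_2) = \mathcal{I}(I, \Psi, \Gamma, \Lambda) \\
\quad \text{and } \Psi \vdash \Gamma <: \Gamma' \\
\quad \text{and } \lambda = \Gamma(r) \\[1ex]
\mathcal{I}((\mathsf{fork}\ v; I), \Psi, \Gamma, \Lambda) = ((\mathsf{fork}\ v; I^{*}), \Psi', C_1 \cup C_2) \\
\quad \text{where } (\Gamma'\ \mathsf{requires}\ \Lambda', C_1) = \mathcal{V}(v, \Psi, \Gamma) \\
\quad \text{and } (I^{*}, \Psi', C_2) = \mathcal{I}(I, \Psi, \Gamma, \Lambda \setminus \Lambda') \\
\quad \text{and } \Psi \vdash \Gamma <: \Gamma' \\[1ex]
\mathcal{V}(v[\lambda], \Psi, \Gamma) = (\tau[\lambda/\lambda'],\ C \cup \{\nu \prec \lambda \prec \rho\}) \\
\quad \text{where } (\forall[\lambda' : (\nu, \rho)].\tau, C) = \mathcal{V}(v, \Psi, \Gamma) \\[1ex]
\mathcal{T}(\forall[\vec{\lambda}].((r_1 : \tau_1, \dots, r_n : \tau_n)\ \mathsf{requires}\ \Lambda)) = \forall[\vec{\lambda} : (\vec{\nu}, \vec{\rho})].((r_1 : \mathcal{T}(\tau_1), \dots, r_n : \mathcal{T}(\tau_n))\ \mathsf{requires}\ \Lambda)
\end{array}
\]

Figure 10: The tagging algorithm (selected rules).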

Then, algorithm s/ creates an initial environment *Fo by generating twelve variables (pi to pi 2 ) to 
annotate the six locks (I; and m;) in the three code blocks that mention locks (MftLeftFork, NftRightFork, 
and eat). They are li : (pi,p 2 ), • • • , m3 : (pn,pi 2 ). Revisiting the signature of code block liftRightFork, 
we get: 

liftRightFork V[m 2 ::(p 7 ,p8)].V[l2::(p5,p6)].(ri:(l2} 12 , r 2 :(m 2 ) m2 ) requires {l 2 } 

In the second pass, while in code block main, the algorithm generates six more permission variables 
($\rho_{13}$ to $\rho_{18}$) to annotate the new lock variables $f_1$ to $f_3$ introduced with the 
newLock instructions. They are $f_1 :: (\rho_{13},\rho_{14}), \ldots, f_3 :: (\rho_{17},\rho_{18})$. 
The rest of the second pass generates new constraints in type applications and in jump-to-critical 
instructions. For example, in code block liftRightFork, and for value eat[$l_2$,$m_2$], four 
constraints are generated: $\rho_9 \prec l_2 \prec \rho_{10}$ and $\rho_{11} \prec m_2 \prec \rho_{12}$. 
Then, in the jump-to-critical instruction if $r_3$ = jump eat[$l_2$,$m_2$], and since the thread holds 
lock $l_2$ (as witnessed by its signature requires $\{l_2\}$), a new constraint $\{l_2\} \prec m_2$ is 
generated. The set of constraints thus created is then passed to a constraint solver, which is bound 
to fail. 
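
One way to see why a constraint set can be unsolvable is to read each constraint $a \prec b$ as a directed edge and look for a cycle: since $\prec$ is a strict partial order, a cycle would force some element to precede itself. The sketch below implements only this necessary check. It treats permission variables as opaque nodes, which simplifies the actual solver, and the function `has_cycle` and the extra constraint `("m2", "l2")` are invented for illustration rather than derived from the running example.

```python
from collections import defaultdict

def has_cycle(constraints):
    """Return True if the pairs (a, b), read as a ≺ b, contain a cycle."""
    successors = defaultdict(list)
    for smaller, larger in constraints:
        successors[smaller].append(larger)

    state = {}  # node -> "active" while on the DFS stack, "done" afterwards

    def visit(node):
        state[node] = "active"
        for nxt in successors.get(node, []):
            if state.get(nxt) == "active":
                return True  # back edge: nxt would have to precede itself
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in list(successors))

# Constraints quoted in the text for code block liftRightFork;
# {l2} ≺ m2 is encoded as one pair per member of the held set.
from_text = [("rho9", "l2"), ("l2", "rho10"),
             ("rho11", "m2"), ("m2", "rho12"),
             ("l2", "m2")]
# Invented opposite ordering, standing in for what another block might contribute.
conflicting = from_text + [("m2", "l2")]
```

Here `has_cycle(from_text)` is `False`, whereas `has_cycle(conflicting)` is `True`: once two locks are each required to precede the other, no strict partial order can satisfy the set.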

Main result For soundness we start with a few lemmas. 

Lemma 6 (Value soundness). If $\mathcal{V}(v,\Psi,\Gamma) = (\tau,C)$ and $\theta$ solves $(\Psi,C)$ then $\Psi\theta; \Gamma\theta \vdash v\colon \tau\theta$. 

Proof. (Outline) The proof proceeds by induction on the inference tree for $\Psi\theta; \Gamma\theta \vdash v\colon \tau\theta$, 
performing case analysis on the last typing rule applied. □ 



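Lemma 7 (Instruction soundness). If $\mathcal{I}(I,\Psi,\Gamma,\Lambda) = (I^\star,\Psi',C)$ and $\theta$ solves $(\Psi',C)$ then 
$\Psi'\theta; \Gamma\theta; \Lambda\theta \vdash I^\star\theta$ and $\Psi \subseteq \Psi'$. 

Proof. (Outline) The proof proceeds by induction on $I$. The cases for conditional jump and fork use 
Lemma 6. □ 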

Theorem 8 (Soundness). If $\mathcal{W}(H) = (\Psi, H^\star)$ then $\Psi \vdash H^\star$. 

Proof. (Outline) Follows directly from Lemma 7 using typing rules for heap values and heaps. We use 
weakening on typing environments before applying the heap rule. □ 

Conversely, we believe that if $\Psi \vdash H^\star$, then $\mathcal{W}(\mathcal{E}(H^\star))$ does not fail, where $\mathcal{E}$ is the 
obvious lock-order annotation erasure function. A stronger result would include a notion of principal 
solutions. 

6 Related Work and Conclusion 

Related work The literature on type systems for deadlock freedom in lock-based languages is vast; 
space restrictions prohibit a general survey. We believe, however, that type inference for deadlock 
freedom in lock-based languages has received little attention in high-level languages, let alone in 
low-level (assembly) languages. Three characteristics separate our work from most proposals on the 
topic: the non-block structure of the locking primitives, the fact that threads never block, and the 
fact that they may be suspended while holding locks. 

Following Coffman et al., one can classify approaches to the problem of deadlock under the categories 
of detection and recovery, avoidance, and prevention [9]. In the first category, detection and 
recovery, one finds, for example, works that check for deadlocks at runtime. Cunningham et al. infer 
locks for atomicity in an object-oriented language, but use a runtime mechanism to detect when a 
thread's lock acquisition would cause a deadlock [11]. Java PathFinder Q and Driver Verifier [3] 
identify violations of the lock discipline during runtime tests. Agarwal et al. [1][2] present an 
algorithm that detects potential deadlocks involving any number of threads. 

Under the avoidance category one finds, e.g., a recent work by Boudol where a type and effect system 
allows for the design of an operational semantics that refuses to lock a pointer whenever it 
anticipates taking a pointer that is held by another thread (21 . 

Our work falls into the third category above, prevention. Flanagan and Abadi present a functional 
language with mutable references where locking is block structured and threads physically block [12]. 
From this work we borrowed the idea of singleton lock types to describe, at the type level, a single 
lock. Type-based deadlock prevention has also been studied in the realm of object-oriented languages, 
where, e.g., Boyapati et al. use a variant of ownership types for preventing deadlocks in Java, 
performing partial inference of annotations, but not of those related to lock order [6]. 

Suenaga proposes a concurrent functional language similar to Flanagan and Abadi's mentioned 
above, except that it features non-block-structured locking [16]: his language includes separate 
primitives for locking and unlocking, as in our case. Albeit targeting a different level of 
abstraction, the results (deadlock prevention) and the techniques (type inference) in both works are 
similar, in particular the usage of a constraint-based algorithm to infer types. Differently from our 
case, Suenaga uses ownership types rather than singleton lock types. 

Concluding remarks We have presented a type system that enforces a strict partial order on lock 
acquisition, guaranteeing that well-typed programs do not deadlock. Towards this end we extended 
the syntax of our language to incorporate annotations on the locking order. Acknowledging that the 
annotation of large assembly programs (either manually or as the result of a compilation process) is not 
plausible, we have introduced an algorithm that infers the required annotations. The algorithm is proved 
to be correct; hence programs that pass our compiler are exempt from deadlocks. 

The current implementation of the algorithm generates, from a non-annotated program, a set of 
constraints in the form of a Prolog goal. The goal is then checked against a Prolog program that 
implements the $\prec$ relation in Figure 5. We consider the program typable if the goal succeeds. 
There is no point in building the annotated syntax or performing the substitution, as explained in 
Section 5. Future work in this area includes the automation of the whole process, either by calling 
the Prolog interpreter from within the compiler or by implementing relation $\prec$ directly in Java, 
the language of our type checker/interpreter. 
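
As a rough illustration of what a constraint set rendered as a Prolog goal might look like, the sketch below prints a list of pairs as a conjunction. The predicate name `prec`, the function `to_prolog_goal`, and the choice to map permission variables to Prolog variables are all hypothetical; the paper does not show the actual shape of the generated goal. The only Prolog fact relied upon is that identifiers starting with an upper-case letter are variables, while those starting with a lower-case letter are atoms.

```python
def to_prolog_goal(constraints, predicate="prec"):
    """Render pairs (a, b), read as a ≺ b, as a conjunctive Prolog goal."""
    def term(name):
        # Permission variables (named rhoN here, by assumption) become Prolog
        # variables; lock names stay lower-case and are therefore atoms.
        return name.capitalize() if name.startswith("rho") else name

    literals = [f"{predicate}({term(a)}, {term(b)})" for a, b in constraints]
    return ", ".join(literals) + "."

goal = to_prolog_goal([("rho9", "l2"), ("l2", "rho10"), ("l2", "m2")])
```

Under these assumptions `goal` is the string `prec(Rho9, l2), prec(l2, Rho10), prec(l2, m2).`, which a Prolog program defining `prec/2` could then attempt to satisfy.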

Future work also includes trying to assess the usage of our type checker on larger programs, generated 
for example from an imperative high-level language, and to further compare the singleton lock types and 
the ownership types approaches for the description of non block structured locking. 